Construction of Hierarchical Neural Architecture Search Spaces based on Context-free Grammars
The discovery of neural architectures from simple building blocks is a long-standing goal of Neural Architecture Search (NAS). Hierarchical search spaces are a promising step towards this goal, but they lack a unifying design framework and typically search over only a limited aspect of architectures. In this work, we introduce a unifying search space design framework based on context-free grammars that can naturally and compactly generate expressive hierarchical search spaces, hundreds of orders of magnitude larger than common spaces from the literature. By enhancing and exploiting the properties of these grammars, we effectively enable search over the complete architecture and can foster regularity. Further, we propose an efficient hierarchical kernel design for a Bayesian Optimization search strategy to efficiently search over such huge spaces. We demonstrate the versatility of our search space design framework and show that our search strategy can be superior to existing NAS approaches. Code is available at https://github.com/automl/hierarchical_nas_construction
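To make the grammar-based construction concrete, here is a minimal sketch that samples hierarchical architecture descriptions from a toy context-free grammar. The nonterminals, productions, and operation names below are illustrative assumptions, not the grammar from the paper; the linked repository contains the actual framework.

```python
import random

# Toy grammar: each nonterminal maps to a list of productions, where a
# production is a list of symbols. The first production of every
# nonterminal is non-recursive and serves as the fallback at the depth cap.
GRAMMAR = {
    "ARCH":  [["BLOCK"], ["BLOCK", "-", "ARCH"]],
    "BLOCK": [["seq(", "OP", ",", "OP", ")"], ["res(", "ARCH", ")"]],
    "OP":    [["conv3x3"], ["conv1x1"], ["maxpool"], ["id"]],
}

def sample(symbol: str = "ARCH", max_depth: int = 8, depth: int = 0) -> str:
    """Sample one derivation of the grammar as an architecture string."""
    if symbol not in GRAMMAR:                 # terminal symbol: emit verbatim
        return symbol
    rules = GRAMMAR[symbol]
    rule = rules[0] if depth >= max_depth else random.choice(rules)
    return "".join(sample(s, max_depth, depth + 1) for s in rule)

for _ in range(3):
    print(sample())   # e.g. seq(conv3x3,maxpool)-res(seq(id,conv1x1))
```

The recursive productions (an ARCH nested inside res, or chained inside another ARCH) are what make the space hierarchical and combinatorially large, which is why even a small grammar can describe enormous numbers of architectures.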
Neural Architecture Search: Insights from 1000 Papers
In the past decade, advances in deep learning have resulted in breakthroughs in a variety of areas, including computer vision, natural language understanding, speech recognition, and reinforcement learning. Specialized, high-performing neural architectures are crucial to the success of deep learning in these areas. Neural architecture search (NAS), the process of automating the design of neural architectures for a given task, is an inevitable next step in automating machine learning, and NAS-designed architectures have already outpaced the best human-designed ones on many tasks. In the past few years, research in NAS has been progressing rapidly, with over 1000 papers released since 2020 (Deng and Lindauer, 2021). In this survey, we provide an organized and comprehensive guide to neural architecture search. We give a taxonomy of search spaces, algorithms, and speedup techniques, and we discuss resources such as benchmarks, best practices, other surveys, and open-source libraries.
Anaphora and coreference resolution: a review
Coreference resolution aims at resolving repeated references to an object in a document and forms a core component of natural language processing (NLP) research. When used as a component in the processing pipeline of other NLP fields such as machine translation, sentiment analysis, paraphrase detection, and summarization, coreference resolution has the potential to substantially improve accuracy. A direction of research closely related to coreference resolution is anaphora resolution. Existing literature is often ambiguous in its usage of these terms and often uses them interchangeably. Through this review article, we clarify the scope of these two tasks. We also carry out a detailed analysis of the datasets, evaluation metrics, and research methods that have been adopted to tackle these NLP problems. This survey is motivated by the aim of providing readers with a clear understanding of what constitutes these two tasks in NLP research and their related issues. This research is supported by the Agency for Science, Technology and Research (A*STAR) under its AME Programmatic Funding Scheme (Projects #A18A2b0046 and #A19E2b0098).
Generative Flows with Invertible Attentions
Flow-based generative models have shown excellent ability to explicitly learn the probability density function of data via a sequence of invertible transformations. Yet, modeling long-range dependencies over normalizing flows remains understudied. To fill this gap, in this paper we introduce two types of invertible attention mechanisms for generative flow models. Specifically, we propose map-based and scaled dot-product attention for unconditional and conditional generative flow models. The key idea is to exploit split-based attention mechanisms that learn the attention weights and input representations on each pair of splits of the flow feature maps. Our method provides invertible attention modules with tractable Jacobian determinants, enabling their seamless integration at any position in a flow-based model. The proposed attention mechanism can model global data dependencies, leading to more comprehensive flow models. Evaluation on multiple generation tasks demonstrates that the introduced attention flow idea yields efficient flow models and compares favorably against state-of-the-art unconditional and conditional generative flow methods.
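The tractability argument can be made concrete with a small sketch. Below is a minimal split-based attention coupling layer in PyTorch; the class name, the two-convolution attention subnetwork, and the sigmoid gating are illustrative assumptions, not the authors' implementation. Because the attention weights are computed from the split that passes through unchanged, the Jacobian is triangular and its log-determinant reduces to a sum of logs.

```python
import torch
import torch.nn as nn

class SplitAttentionCoupling(nn.Module):
    """Minimal sketch of a split-based invertible attention layer.

    Attention weights are computed from the first half of a channel split
    and modulate only the second half, so the Jacobian is triangular and
    its log-determinant is cheap to evaluate.
    """

    def __init__(self, channels: int):
        super().__init__()
        half = channels // 2
        # Map-based attention subnetwork over the first split
        # (this two-conv design is an illustrative assumption).
        self.attn = nn.Sequential(
            nn.Conv2d(half, half, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(half, half, kernel_size=3, padding=1),
        )

    def forward(self, x: torch.Tensor):
        x1, x2 = x.chunk(2, dim=1)           # split feature maps in two
        a = torch.sigmoid(self.attn(x1))     # attention weights in (0, 1), from x1 only
        y2 = x2 * a                          # modulate the second split
        # Triangular Jacobian: log|det J| is the sum of log a over scaled entries
        logdet = torch.log(a).flatten(1).sum(dim=1)
        return torch.cat([x1, y2], dim=1), logdet

    @torch.no_grad()
    def inverse(self, y: torch.Tensor) -> torch.Tensor:
        y1, y2 = y.chunk(2, dim=1)
        a = torch.sigmoid(self.attn(y1))     # recomputable exactly, since y1 == x1
        return torch.cat([y1, y2 / a], dim=1)

# Usage: forward yields outputs plus log-determinant; inverse recovers the input.
layer = SplitAttentionCoupling(channels=8)
x = torch.randn(4, 8, 16, 16)
y, logdet = layer(x)
x_rec = layer.inverse(y)
print(torch.allclose(x, x_rec, atol=1e-5))  # True up to numerical precision
```

Since the weights depend only on the untouched split, they can be recomputed exactly during inversion, which is what keeps such an attention module usable anywhere in a flow.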